ETO Development Tools 1

Text File | 1990-02-16 | 9.0 KB | 154 lines | [TEXT/GEOL]

Item forwarded by A33 to A34
Item 5735846 12-Feb-90 19:31PST
From: D4384 US Voting Mach, Sarner, Calvin,PRT
To: MACAPP.TECH$ MacApp Technical
cc: D5295 Reseach SW Design, D Goldman,PRT
Sub: Persistent Guerillas Needed

What follows is a summary of the persistent object problem in MacApp, with some possible solutions, and a proposal for guerilla tactics.

Persistent objects may be minimally defined as objects whose state persists across invocations of an application. Traditionally, persistent data on the Mac lives in documents. But there is not yet any general-purpose mechanism for treating collections of objects as documents. Such a mechanism would allow opening a document to restore all objects in the document to the state they had when the document was closed, that is, the state of being fully instantiated descendants of TObject. Ideally, the methods of the classes being instantiated would also be restored, but this is not possible without more compiler and linker support than MPW provides.

Various solutions to the persistent object problem have been discussed here so far, and no doubt many others have been implemented:

• Edmund noted that view resources are already a limited solution for descendants of TView.
• Larry's Frameworks article presents a partial solution in the form of TStream, so long as there are not multiple references among persistent objects. Also, the entire stream must be read into memory.
• Greg (me) presents a method (which Edmund happily calls virtual objects) for swapping persistent objects in and out of memory.
• Someone (I've lost the link) describes keeping persistent objects on an Inside/Out database and bringing them into memory as needed.

Each of these solutions is appropriate to its intended use, but all fall short of a general solution. A general solution would:

1) Allow a document to be treated as a collection of persistent descendants of TObject. No special code should be needed for object I/O.
2) Allow access to documents larger than available memory, even larger than virtual memory.
3) Allow persistent objects to refer to other persistent objects in arbitrary networks, including cycles.
4) Allow multiple users to have simultaneous access to the same collection of objects.

Some ideas for achieving these goals follow.

1) Larry's use of metadata points the way here. The main criticism of his suggestion is that we must write code for each object to support I/O. I have not found this to be so bad, as each object can INHERIT the I/O methods of its ancestors and just add in code for its new fields. But there should be a better way. One solution would be to use the Fields method of each object to drive a general I/O method, as the Fields method already provides information about the type and location of each field of an object. Of course we still need to write a Fields method for each object, but we are already supposed to be doing that (not that I do!). Ideally, the compiler would generate the Fields method for us.

A second issue is how to save and restore the object metadata. I would suggest that we create a resource type, similar to a view resource, for the purpose of saving the class names and field types of persistent objects. Each object on the data fork could then have in its header the ID of its class name resource.

2) Big documents present big problems. Virtual objects help, but require that at least a handle and a header remain in memory for each persistent object, and also require explicit locking of objects to be sure they are in memory before they are accessed. Even so, I think a general solution will include virtual objects. The idea is to keep a cache of the most active objects in memory, with less active objects being swapped out by writing their data to disk and resizing their handles down to just a small header, including the ClassID.
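
The cache idea above can be sketched roughly as follows. This is an illustration in Python rather than Object Pascal, not the scheme as actually implemented: the names VirtualObject, VirtualStore, touch and capacity are hypothetical, one file per object stands in for a document's data fork, pickle stands in for real object I/O, and the explicit locking mentioned above is left out.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class VirtualObject:
    """Resident header (key, class_id) plus data that may be swapped out."""

    def __init__(self, store, key, class_id, data):
        self.store = store        # owning store (hypothetical)
        self.key = key            # stand-in for a file offset
        self.class_id = class_id  # stays in memory even when swapped out
        self._data = data         # None while the data lives only on disk

    @property
    def resident(self):
        return self._data is not None

    def get(self, field):
        # Fields are reached only through methods, so swapping stays hidden.
        return self.store.touch(self)[field]

    def set(self, field, value):
        self.store.touch(self)[field] = value


class VirtualStore:
    """Keeps the data of at most `capacity` (>= 1) objects in memory."""

    def __init__(self, directory, capacity):
        self.directory = directory
        self.capacity = capacity
        self.active = OrderedDict()  # key -> object, least recent first
        self.next_key = 0

    def new(self, class_id, **fields):
        obj = VirtualObject(self, self.next_key, class_id, dict(fields))
        self.next_key += 1
        self.touch(obj)
        return obj

    def _path(self, obj):
        return os.path.join(self.directory, f"{obj.key}.obj")

    def touch(self, obj):
        """Swap obj in if needed, mark it most recent, evict the overflow."""
        if obj._data is None:
            with open(self._path(obj), "rb") as f:
                obj._data = pickle.load(f)
        self.active[obj.key] = obj
        self.active.move_to_end(obj.key)
        while len(self.active) > self.capacity:
            _, victim = self.active.popitem(last=False)
            with open(self._path(victim), "wb") as f:
                pickle.dump(victim._data, f)
            victim._data = None  # shrink the victim to its header
        return obj._data


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        store = VirtualStore(tmp, capacity=2)
        a = store.new("TShape", name="a")
        b = store.new("TShape", name="b")
        store.new("TShape", name="c")       # cache is full, so a is swapped out
        swapped_out = not a.resident
        header_ok = a.class_id == "TShape"  # the header survives the swap
        name = a.get("name")                # swaps a back in, b goes out
        return swapped_out, header_ok, name, b.resident
```

The sketch writes an object's data on every eviction; a real version would presumably track whether the data had changed since it was last written.
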
Because that header stays in memory, the methods of a virtual object can be dispatched whether or not the object is all in memory, but the data must be swapped in before it is referenced. Note that following the discipline of accessing object fields only through methods can largely eliminate concerns over when to swap, except within those methods themselves.

An alternative solution is to keep objects in persistent collections (such as a database) and check objects in and out of the database as needed. This is less flexible than virtual objects, but can allow for arbitrarily large collections of objects. The main restriction is that an object in a collection must be referenced only as a member of that collection, and can be a member of at most one collection at a time. Also, the database must support heterogeneous collections for some typical uses of objects to be effective (such as walking a list of heterogeneous objects in a Draw method). Relational databases explicitly forbid this.

An integrated solution would allow both scalar virtual objects and objects which are in fact collections of other objects. MacApp provides only the TList type, whereas Smalltalk and others provide a rich set of collection classes. At the least a flat file, heterogeneous list, and ordered list (BTree) class could be provided, with an interface modeled on TList.

3) References among objects are a pain, especially arbitrary networks of references. The view architecture finesses this one by requiring views to form a hierarchy that can be traversed from the root, so that references to superviews and subviews can be resolved with finite searches. A more general solution proposed by Larry is to give each persistent object an ID (which might be its offset in the document file), refer to objects by ID, and maintain a translation table of IDs to handles. Then each ID reference can be resolved at runtime into a handle to the object.
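
A minimal sketch of such a translation table, in Python, with ordinary object references standing in for handles. The names ObjectTable, Node, peer_ids and load are hypothetical, and the record format is invented for illustration. Because each reference is looked up only when it is followed, the order in which objects are read does not matter and cycles cause no trouble.

```python
class ObjectTable:
    """Translation table from persistent IDs to in-memory objects."""

    def __init__(self):
        self._by_id = {}

    def register(self, obj_id, obj):
        self._by_id[obj_id] = obj

    def resolve(self, obj_id):
        # Raises KeyError if no object with this ID has been read in.
        return self._by_id[obj_id]


class Node:
    """Hypothetical persistent object that refers to its peers by ID only."""

    def __init__(self, table, obj_id, name, peer_ids):
        self.table = table
        self.obj_id = obj_id
        self.name = name
        self.peer_ids = list(peer_ids)
        table.register(obj_id, self)

    def peers(self):
        # IDs are resolved when the reference is followed, not when read.
        return [self.table.resolve(i) for i in self.peer_ids]


def load(records):
    """Build objects from (id, name, peer_ids) records, in any order."""
    table = ObjectTable()
    for obj_id, name, peer_ids in records:
        Node(table, obj_id, name, peer_ids)
    return table
```

For example, `load([(10, "a", [20]), (20, "b", [10])])` builds two objects that refer to each other, and following a's reference and then b's leads back to a.
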
With a virtual object scheme, the translation table might be created and used only while the document is being opened, and could then be disposed of. The main problem for IDs is unresolved references. When an object is first read in, the IDs of the objects it refers to might not yet be in the translation table. So we might have to keep a list of unresolved references as we first traverse the document, then make a pass over this list to finally resolve them. If virtual object IDs are implemented as file offsets then they can be resolved as they are encountered. This works OK for trees and acyclic graphs, but could cause infinite regress in the presence of cycles. Perhaps one bit of the ID could serve to indicate whether a reference is a file offset or a resolved handle, and thus stop the regress.

The persistent collection approach allows references only to the values of objects, or their location in a collection, but never to their identity. This finesses the problem by simply not allowing direct references among objects.

4) Multi-user access presents all the data integrity and deadlock problems of any distributed database. Implementing persistent collections on top of an existing database is by far the easiest solution. For virtual objects, the memory locking routines could be combined with a file range locking protocol to allow multi-user access. This would still require the programmer to prevent deadlock and maintain data integrity by careful design of access protocols (e.g. two-phase transaction locking). A really sophisticated lock manager process might be able to prevent deadlock, or at least detect it once underway, but this is usually too hard to do in general.

In summary then, it appears that the problem of persistent objects is in need of a general solution, or class of solutions. I believe guerilla action is called for. At the least, a simple subclass of TObject could be provided which would standardize object I/O and metadata resources.
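
One possible shape for such a subclass, following idea 1) above, is sketched here in Python. Everything in it is an assumption made for illustration: TPersistent, CLASS_REGISTRY and the persistent decorator are hypothetical names, fields() returns a plain list where MacApp's Fields method works differently, a class name written in each object's header stands in for the ID of a class name resource, and only 16-bit integer fields are shown. The point is that subclasses describe their fields once and inherit save and load unchanged.

```python
import io
import struct

# Hypothetical registry standing in for class name resources.
CLASS_REGISTRY = {}


def persistent(cls):
    """Class decorator that records a class under its name."""
    CLASS_REGISTRY[cls.__name__] = cls
    return cls


class TPersistent:
    """Base class whose save and load are driven entirely by fields()."""

    @classmethod
    def fields(cls):
        # Subclasses extend the inherited list with (name, struct format).
        return []

    def save(self, stream):
        name = type(self).__name__.encode("ascii")
        stream.write(struct.pack(">B", len(name)) + name)  # object header
        for field, fmt in self.fields():
            stream.write(struct.pack(fmt, getattr(self, field)))

    @staticmethod
    def load(stream):
        (length,) = struct.unpack(">B", stream.read(1))
        cls = CLASS_REGISTRY[stream.read(length).decode("ascii")]
        obj = cls.__new__(cls)  # create the object without running __init__
        for field, fmt in cls.fields():
            raw = stream.read(struct.calcsize(fmt))
            (value,) = struct.unpack(fmt, raw)
            setattr(obj, field, value)
        return obj


@persistent
class TShape(TPersistent):
    @classmethod
    def fields(cls):
        return super().fields() + [("left", ">h"), ("top", ">h")]


@persistent
class TCircle(TShape):
    @classmethod
    def fields(cls):
        # Only the new field is declared; the rest is inherited.
        return super().fields() + [("radius", ">h")]


def round_trip(obj):
    """Save obj to an in-memory stream and read it back."""
    buf = io.BytesIO()
    obj.save(buf)
    buf.seek(0)
    return TPersistent.load(buf)
```

References to other objects are not handled here; they would need to be written as IDs and resolved after reading.
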
Perhaps a clipboard type for persistent objects could be provided as well. Beyond that, a virtual object capability could also be easily provided, preferably one that could resolve multiple references. This much alone would allow a persistent document type to be defined which could be opened by any application which defined the necessary classes. For classes which are not defined by the application at least some methods could still be supported, at least enough to make copies and extract field data.

The problems of multiple users and large collections are much harder. At least some abstract classes could be defined for interfacing to collections in databases, with access to particular databases provided by subclassing. Some simple concrete implementations would be nice, although they would probably not meet the performance needs of really big jobs (like CD-ROM). An abstract class approach might also be taken for multi-user access to virtual objects.

In this way a framework could be provided for easy sharing of data among applications, with the most difficult, performance-critical, and application-specific issues left to the capitalists to pay for.

If you have read this far I take it you are interested, so where do you think we should go from here?

Yours truly,
Greg Colvin